Performance of a SCFG-Based Language Model with Training Data Sets of Increasing Size

Authors

Joan-Andreu Sánchez

José-Miguel Benedí

Diego Linares

Abstract

In this paper, a hybrid language model which combines a word-based n-gram and a category-based Stochastic Context-Free Grammar (SCFG) is evaluated for training data sets of increasing size. Different estimation algorithms for learning SCFGs in General Format and in Chomsky Normal Form are considered. Experiments on the UPenn Treebank corpus are reported. These experiments have been carried out in terms of the test set perplexity and the word error rate in a speech recognition experiment.
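As a rough illustration of the perplexity measure mentioned in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes the two component models are combined by linear interpolation with a weight `lam`; the abstract does not state the combination scheme, and the function names and all probability values here are hypothetical.

```python
import math


def interpolate(p_ngram, p_scfg, lam):
    """Combine two per-word probabilities by linear interpolation.

    Assumption: linear interpolation is only one possible way to build
    a hybrid model; the abstract does not specify the scheme used.
    """
    return lam * p_ngram + (1.0 - lam) * p_scfg


def perplexity(word_probs):
    """Test-set perplexity: exp of the negative mean log-probability per word."""
    log_sum = sum(math.log(p) for p in word_probs)
    return math.exp(-log_sum / len(word_probs))


# Hypothetical per-word probabilities from each component model.
ngram_probs = [0.10, 0.05, 0.20, 0.02]
scfg_probs = [0.08, 0.10, 0.10, 0.06]

# Hypothetical interpolation weight of 0.7 for the n-gram component.
hybrid_probs = [interpolate(a, b, 0.7) for a, b in zip(ngram_probs, scfg_probs)]

print("n-gram perplexity:", perplexity(ngram_probs))
print("hybrid perplexity:", perplexity(hybrid_probs))
```

Lower perplexity means the model assigns higher probability to the test words. In practice an interpolation weight would be tuned on held-out data rather than fixed by hand.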

Related resources

MAN-MACHINE INTERACTION SYSTEM FOR SUBJECT INDEPENDENT SIGN LANGUAGE RECOGNITION USING FUZZY HIDDEN MARKOV MODEL

Sign language recognition (SLR) has attracted growing interest in the human–computer interaction community. The major challenge SLR now faces is developing methods that scale well with increasing vocabulary size, given a limited set of training data for signer-independent applications. Automatic SLR based on hidden Markov models (HMMs) is very sensitive to gesture's shape inf...

Shannon’s Entropy of The Stochastic Context-Free Grammar and an Application to RNA Secondary Structure Modeling

Stochastic context-free grammars (SCFGs) have been used in RNA secondary structure modeling. An SCFG consists of a set of grammar rules, each with an associated probability. Given a grammar design, finding the set of probabilities that yields optimum performance can be challenging. Although current Expectation Maximization (EM) Maximum Likelihood (ML)-based model training approaches have been effective,...

Estimation of the mean grain size of mechanically induced Hydroxyapatite based bioceramics via artificial neural network

This study focuses on the estimation of the mean grain size of mechanically induced Hydroxyapatite (HA) through the artificial neural network (ANN) model. The mean grain size of HA and HA based nanocomposites at different milling parameters were obtained from previous studies. The data were trained and tested by the neural network modeling. Accordingly, all data (55 sets) were based on the mecha...

Speaker Adaptation in Continuous Speech Recognition Using MLLR-Based MAP Estimation

A variety of methods are used for speaker adaptation in speech recognition. In some techniques, such as MAP estimation, only the models with available training data are updated. Hence, large amounts of training data are required in order to have significant recognition improvements. In some others, such as MLLR, where several general transformations are applied to model clusters, the results ar...

Publication date: 2005
